Experimental Methods in Cognitive Psychology

 

 

Michael Dougherty

 

University of Maryland

All text and notes are copyrighted by Dr. Michael Dougherty

Reproduction of this material is prohibited without authorization by the author.


 

 

Chapter 1

The Science of Psychology

 

It is an inevitable topic of any and every psychology course that we discuss the nature of science. The question of what constitutes a science is as wrapped up in the psychology curriculum as Pavlov’s dogs or Freud’s theory of repression. But why are psychologists so concerned with the definition of “science”? Is it that we believe our students haven’t heard it before? Or is it that we feel the need to impress upon our students that psychology really is a science? Perhaps we hope that our students will spread the word to non-psychologists, and psychology will eventually be accepted as a science, just as biology, chemistry, and physics are accepted as scientific fields. Whatever the reason those before me have been so concerned with the definition of science, we too will begin by developing a definition of “science.”

 

Once upon a time I was a skeptic of the idea that psychology was a science. My lay impression was that psychology dealt with people and was mostly concerned with how to make people feel good, or cope, or whatever. Although I soon came to realize that psychology was a science, I still had the impression that it was not a full-fledged science: it was too different from biology, chemistry, and physics to be considered in the same breath as those disciplines. After all, biology, chemistry, and physics are hard sciences, and psychology did not seem to fit with the hard sciences. I now, of course, see why I didn’t view psychology as a fully developed science. I was naïve; I didn’t really understand what psychology was about, and I certainly didn’t understand what made something a science. I now view psychology as every bit as much a science as physics. Indeed, the idea that psychology is not as hard a science as physics or chemistry is now upsetting to me.

 

So what changed in my thinking over the years? How did I come to view psychology as a “hard” science? Part of my problem was that I let my naivety get the best of me. I knew physics, chemistry, and biology were sciences, so I let myself define science by discipline: certain disciplines were sciences and others simply weren’t. But obviously, defining science by discipline does little to help us recognize new sciences as they emerge. Perhaps we are inclined to use the goals of science to help us define it: sciences are those disciplines that deal with facts, or that have the goal of uncovering facts of nature. Or perhaps sciences are those disciplines that include elaborate theories. However, if we relied solely on these criteria, we would be left including history and music among the sciences. Certainly, history and music are not to be considered sciences, so what sets them apart from the true scientific disciplines?

 

If there is one thing we can point to that helps us identify scientific disciplines, it is the process by which we search for facts. All sciences are concerned with facts, but so are non-scientific disciplines such as history; it is the process by which scientific disciplines search for facts that sets them apart. Modern sciences are remarkably similar with respect to the process of discovering facts – a process that we refer to as the scientific method.

 

We usually characterize the scientific method as consisting of four stages:

 

(a) Statement of a hypothesis.
(b) Observation.
(c) Replicability.
(d) Development of a law or theory to explain the observation.

The scientific method isn’t a linear application of these four steps; rather, it is an iterative application and refinement of them. The advancement of science depends on being able to hypothesize, observe, replicate, refine theory, develop new hypotheses, observe, replicate, and refine theory again, and so on. The ultimate goal, of course, is to be able to explain the world in the most parsimonious, or simple, way. The scientific method is a process that allows us to examine our explanations, or theories, in a systematic fashion.

 


Theory.

Let us first start with a definition of what I mean by a theory. Theories come in many forms. Some are complex and others simple. Some are expressed verbally, and others are expressed mathematically. Some are based on casual observation, and others on formal experimental situations. If you’ve taken a course in social psychology, you know that people form stereotypes. For example, I used to have a theory that people who owned minivans were overly cautious, because it seemed that every time I found myself behind one on the road it was going 10 mph below the speed limit. Stereotypes are theories individuals hold that are based on, and often perpetuated by, casual observation. I once overheard a nerdy couple in a hardware store, looking at picture-hanging hooks, discussing their “theory” about the type of material covering their wall so that they could choose the appropriate mounting screw. Their theories were no doubt based on casual observation of the walls (were they plaster, wood, drywall?), but they were no less theories than Einstein’s theory of relativity, albeit a little less elegant.

Although theories come in different varieties, all serve the same purpose: they are explanations of the world that enable us to understand seemingly complex relationships and make predictions. My minivan theory enabled me to predict how late I would be for work if I got behind a minivan. The hardware-store couple were using their theories to predict which type of mounting hardware would work best for their walls. Whether or not the predictions are accurate is a matter of how well the theory captures the essence of the world about which we’re theorizing.

 

Obviously, the problem with basing theories on informal observation is that we are prone to search out information that confirms our theory. This is well known in the psychological literature as the confirmation bias – the bias toward searching for confirming evidence and ignoring potentially disconfirming evidence. A good example of the confirmation bias can be demonstrated with the Wason 2-4-6 task. In this task, the experimenter devises a rule that can be used to generate sequences of numbers. Suppose that I have a rule that can be used to generate the sequence of numbers “3 – 10 – 32”. Your task is to try to figure out what rule I used to generate those numbers. In order to do so, you are given the opportunity to generate sequences of three numbers. After each sequence, I tell you whether or not the sequence conforms to my rule. After you have produced 15 three-number sequences, you can try to guess my rule. In other words, you have 15 chances to test whether the rule you think I used to generate the numbers is the rule I actually used. Most people who do this task eventually figure out the correct rule (three numbers in ascending order), but only a few actually test the rule by trying to generate sequences that could potentially disconfirm it. People who figure out the rule tend to continue to generate sequences of digits in ascending order – sequences that would confirm the rule “three numbers in ascending order”. In fact, they are unwittingly confirming several theories simultaneously: “the last digit must be greater than the first two digits”, “the last digit must be greater than the second digit”, “the last digit must be greater than the first digit”, and “the last digit must be greater than at least one of the prior digits”. Few people recognize that to adequately test the “three digits in ascending order” hypothesis they have to generate sequences that could disconfirm that rule and that disconfirm the alternatives. Hence, they should generate sequences such as “23 – 19 – 1”, “1 – 23 – 9”, and “23 – 1 – 9”. Note that our tendency to engage in the search for confirmatory evidence is what perpetuates stereotypes. My minivan theory (or stereotype) seems to gain support every time I find myself behind a slow-moving minivan. The problem is, I don’t notice it so much when I’m behind a fast-moving minivan.
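To make this logic concrete, here is a minimal sketch in Python. The alternative rules are made-up stand-ins for the hypotheses described above; the point is that sequences which merely confirm cannot tell the candidate rules apart, while the disconfirming sequences listed above can.

```python
# Candidate rules are just functions returning True/False for a three-number
# sequence. The specific alternatives and test sequences are illustrative.

def ascending(seq):                      # the experimenter's actual rule
    return seq[0] < seq[1] < seq[2]

def last_greater_than_first(seq):
    return seq[2] > seq[0]

def last_greater_than_second(seq):
    return seq[2] > seq[1]

candidates = {
    "three numbers ascending": ascending,
    "last > first": last_greater_than_first,
    "last > second": last_greater_than_second,
}

def run_tests(tests, actual_rule):
    """Show which candidate rules survive the experimenter's yes/no feedback."""
    viable = set(candidates)
    for seq in tests:
        feedback = actual_rule(seq)                       # "does this fit my rule?"
        for name, rule in candidates.items():
            if name in viable and rule(seq) != feedback:  # prediction contradicted
                viable.discard(name)
        print(f"{seq}: fits rule? {feedback}; still viable: {sorted(viable)}")

print("Confirming tests only (all ascending):")
run_tests([(2, 4, 6), (4, 6, 8), (10, 20, 30)], ascending)

print("\nTests chosen to disconfirm:")
run_tests([(23, 19, 1), (1, 23, 9), (23, 1, 9)], ascending)
```

Running the first set leaves every candidate rule standing; running the second eliminates the alternatives and leaves only the ascending-order rule.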

 

The world is a complex place, and theories are there to make this complex place simple to understand. Theories are not full-blown descriptions of all existing complex relations, but are simplified explanations of a subset of complex relationships. It is important to note that theories are nothing more than explanations of phenomena. It is also important to note that theories are meant to be simple in that they should try to take complex phenomena and boil them down to the essential components.

I stress the importance of simplicity because simple theories are easy to understand, and inasmuch as the goal of science is to understand the world, simple explanations of complex phenomena should be preferred. This property of theories is called parsimony, and it is one of the assumptions of science: if a simple theory has as much explanatory power as a more complex theory, then preference is given to the simple theory. Simple theories are said to be parsimonious.

Hypothesis.

Theories should not only provide explanations of phenomena; they should also give rise to interesting research questions, or hypotheses. When we make predictions, we’re actually stating a hypothesis. A hypothesis is our best guess of what we expect to observe in any particular situation, experimental or otherwise. Hypotheses are often derived from formal theory, but can also be derived from logic or prior observation (or what I call informal theory). The formation of a hypothesis seems fairly straightforward, but a lot goes into making a hypothesis testable. A simple demonstration will help illustrate how this is so.

Suppose that I form the following hypothesis: Men are more likely to hold a door for a woman than for another man. There are a few things to note about my hypothesis. First, the scope of the hypothesis is with respect to door holding behavior among men. The hypothesis does not address what would happen among women. Thus, any empirical test of this particular hypothesis need not include observation of women door holders. (Of course we could modify this hypothesis however we see fit to include women door holders.) Second, the hypothesis does not define what I mean by “door-holding behavior”. This is something that must be settled prior to beginning any observation of door-holding behavior among men. 

How we decide to define door-holding behavior is crucial to our test of the hypothesis, and is not a trivial task. If a group of us were to monitor a door to the psychology building for a few hours, I suspect there would be little agreement on what behaviors constituted door holds. There would surely be those cases in which door holding was obvious and indisputable, such as when one pulls open the door, holds it with their hand while letting a second person enter before them. But there would be plenty of disagreements too. For example, should we count those cases in which the potential door-holder releases the door before the other person completely crosses the entrance? How should we count those cases in which the potential door holder opens the door wide enough for a second person to enter (with or without looking at the person behind them)? What about those cases where the potential door holder holds the door open only long enough for the holdee to grab the handle for themselves? In short, what constitutes a door hold?

Not only do we need to be able to detect instances of door-holding behavior, we also need to be able to detect non-occurrences. Thus, we will need to be able to decide when two or more people constitute a legitimate observation. How close in proximity do two or more people need to be before we count them as an observation? Note that what we’re defining is what constitutes a “dyad” – a pair of individuals whom we classify as a legitimate observation. Certainly we would not expect someone to hold the door for another unless that person is somewhat close to the potential holder, but how close does the pair need to be before we can count a failure to hold as a genuine non-occurrence: 100 yards, 10 yards, 2 feet? We must be in total agreement about what constitutes a “dyad” and what constitutes a door hold. In other words, we need to operationalize these constructs so we can carry out our study in a systematic manner and so that anybody who wishes to can replicate our study.
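As an illustration only – the thresholds and criteria below are arbitrary choices, not definitions from this chapter – here is one way such a coding scheme might be written down explicitly before observation begins:

```python
# A hypothetical coding scheme for the door-holding study. Every choice here
# (10 feet for a dyad, keeping contact until the second person crosses) is an
# arbitrary assumption; the point is that the choices are fixed in advance.

from dataclasses import dataclass

@dataclass
class Event:
    distance_between_ft: float   # distance between the two people at the door
    holder_kept_contact: bool    # did the first person keep a hand on the door?
    second_person_crossed: bool  # did the second person cross while it was held?

def is_dyad(event: Event, max_distance_ft: float = 10.0) -> bool:
    """Count the pair as a legitimate observation only if they are close enough."""
    return event.distance_between_ft <= max_distance_ft

def is_door_hold(event: Event) -> bool:
    """Count a door hold only if contact was kept until the other person crossed."""
    return event.holder_kept_contact and event.second_person_crossed

observations = [
    Event(3.0, True, True),    # clear hold
    Event(4.0, True, False),   # released early -> not a hold under this definition
    Event(40.0, False, False), # too far apart -> not even a dyad
]

dyads = [e for e in observations if is_dyad(e)]
holds = [e for e in dyads if is_door_hold(e)]
print(f"{len(dyads)} dyads observed, {len(holds)} coded as door holds")
```

Two observers applying this same scheme should code the same events the same way – that is the whole point of operationalizing the constructs.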

 

Take a few moments to think about how you would decide what constitutes a dyad, and what constitutes a door hold. Write them down here and compare them with your classmates.

 

 

 

 

Definitions of this type are called operational definitions. Any construct or variable that we wish to measure, observe, or manipulate needs to be operationalized. Operational definitions are what we use to define the constructs we wish to study. For example, suppose that we want to know the effect of massages on stress. In this case we must be able to operationalize the massage (exactly how will the massage be done, what technique, will there be music present, etc.), and we must operationalize how we plan to measure stress. Stress cannot be measured directly, so we must use a proxy, such as galvanic skin response, heart rate, blood pressure, pupil dilation, finger-tapping rate, or subjective ratings. How we decide to measure stress will undoubtedly affect our results and conclusions, since some measures will be better measures of stress than others.

It is easy to find examples where the failure to operationalize variables has had serious consequences. A case in point is the 2000 presidential election. Florida law stated that votes were to be counted as long as the voter showed a clear intention of their vote (or whatever the exact wording of the phrase was). But what do we mean by “clear intention”? The failure to operationalize “clear intention” in terms of the methodology used to collect votes – in this case, paper cards with punch holes – led to disagreements about which votes should be counted. Is clear intention marked by a dimpled punch hole (or chad), or only by a completely punched chad? How should you count cards where two of the four corners of the chad were broken, or where three of the four corners were broken? Typically these distinctions would be of no consequence in an election, but when the election is determined by a handful of votes the distinction becomes terribly important.

How we end up operationalizing our variables is part and parcel of the idea behind construct validity. Some operationalizations of our variables are more closely tied than others to the underlying constructs of our theory. Suppose we have a theory that says that caffeine consumption is related to stress levels, and that we run a study and find a relationship between caffeine intake and stress, where stress is measured in terms of finger-tapping speed. What sort of claims can we make with respect to our theory? Can we say that there is a relationship between caffeine and stress? Maybe, but perhaps the more accurate statement is that there is a relationship between caffeine and our measure of stress, finger-tapping speed. Whether finger-tapping speed is actually a measure of stress is an issue of construct validity. We will return to the notion of validity in the next chapter. For now, suffice it to say that our operationalizations place constraints on the sort of theoretical claims we can make.

 

 

Observation.

Once the hypothesis is formed and our variables operationalized, we must test the hypothesis – that is, we need to make observations. Observations are made using formal methodologies, which will be introduced in the next chapter. Methodologies can be divided into four general classes: controlled (or true) experiments, quasi-experiments, surveys, and naturalistic observation. Controlled or true experiments are observations that take place in the lab under strict laboratory controls – they are perhaps what we think of when we think of science. The hallmark of a true experiment is that participants are randomly assigned to conditions.

 

Quasi-experimental designs involve comparisons between naturally occurring variables. For example, any comparison between males and females, between different ethnic groups, or between different age groups is quasi-experimental, because people cannot be randomly assigned to these naturally occurring groups.

 

Surveys are observations made to discover or explore the relationships among variables. Surveys are often quasi-experimental, in that the data obtained are frequently correlational.

 

Naturalistic observations are observations that we make of a phenomenon in its naturally occurring environment.

 

All four methods serve the purpose of formalizing how we make our observations, but they don’t all allow the same types of conclusions. For example, inferences about causation can only be made using a true experiment. Naturalistic observations don’t allow us to infer causation, but they do allow us to explore the phenomenon of interest in the setting in which it naturally occurs. Surveys are most often used to identify correlational relationships, and most frequently do not allow causal inference.

 

Replicability.

Replication is as important to the scientific process as theory and observation. After all, we can’t build theory without being confident that the results we obtain in an experiment are reliable. So we replicate. Replication achieves multiple goals. The first is that it increases our confidence that the effect we observed is in fact real. Suppose we run the same experiment 100 times and obtain an effect exactly 5 times. How much confidence would you have in this experiment? Probably not much, and certainly not as much as in an experiment that replicates 99 times out of 100. Note that finding an effect five times out of 100 chances is exactly what we would expect if there were no real effect present and we used a significance level of α = .05 in our statistical tests. That is, the statistical properties inherent in the world ensure that we will erroneously observe an effect of our variable some of the time, even if the true state of affairs is that there is no effect.
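A small simulation makes the point concrete. In this sketch (the sample sizes and the choice of a t-test are arbitrary), both groups are drawn from the same population, so there is no real effect; with α = .05, roughly 5 of 100 such experiments come out “significant” anyway.

```python
# Run 100 "experiments" in which the null hypothesis is true (both groups come
# from the same distribution) and count how many reach p < .05 by chance alone.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05
n_experiments, n_per_group = 100, 30

false_positives = 0
for _ in range(n_experiments):
    group_a = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    group_b = rng.normal(loc=0.0, scale=1.0, size=n_per_group)  # same population
    _, p = stats.ttest_ind(group_a, group_b)
    if p < alpha:
        false_positives += 1

print(f"{false_positives} of {n_experiments} null experiments were "
      f"'significant' at alpha = {alpha}")
```

A finding that appears in one such experiment but fails to replicate is exactly the kind of false positive this simulation produces.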

 

Remember that thing about the type I error rate you learned in your statistics class? Well, replication helps keep us from committing type I errors – that is, it keeps us from falsely concluding that a null hypothesis is false when it is actually true. In this sense, replication is one way of protecting ourselves against violations of a second type of validity – statistical conclusion validity.

 

A second goal of replication is to test the robustness, or generality, of the effect. Does the variable of interest have an effect across a variety of populations? Do we find the same effect with the elderly as we do with college students? Does the variable have an effect when we modify the task? All these questions are answered through replication. Note that the answers to these questions address yet a third type of validity – external validity.

 

Develop law or theory.

 

Developing a law or theory is both the most challenging and the most enjoyable part of science. It’s challenging because the theory must be specific enough to yield precise empirical predictions, yet general enough to summarize existing data. It’s the most enjoyable because theories are the lens through which we view the world. Without theories, we would be left explaining the world with the very observations we’re trying to explain.

 

Take, for example, the serial position effect in free recall. In the typical serial position experiment, participants study a list of, say, 20 words. The words are presented one at a time at a constant rate (e.g., each word is presented for 4 seconds). In one variant of this task, participants are asked to recall as many words as they can immediately after the last word is presented. The recall task merely requires participants to output the words, regardless of order. This task is repeated several times for each participant. Note that some words are presented toward the beginning of the list, some toward the middle, and others just prior to participants being asked to recall. In examining serial position effects, the experimenter wants to know: of the words presented first in the serial order, how many were recalled? Of the words presented second, how many were recalled? And so on. By tabulating the results this way, we can get a sense of how well participants are able to recall as a function of the serial position of the words on the list.
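The tabulation step itself is simple. Here is a rough sketch (the recall data below are invented purely for illustration): for each serial position, count the proportion of trials on which the word at that position was recalled.

```python
# Each trial is recorded as the set of list positions (1-20) the participant
# recalled. These particular trials are made up for illustration.

recalled_positions_by_trial = [
    {1, 2, 3, 18, 19, 20},
    {1, 2, 10, 19, 20},
    {1, 3, 4, 17, 19, 20},
    {2, 3, 11, 18, 20},
]

list_length = 20
n_trials = len(recalled_positions_by_trial)

# Proportion of trials on which the word at each serial position was recalled.
recall_by_position = {
    pos: sum(pos in trial for trial in recalled_positions_by_trial) / n_trials
    for pos in range(1, list_length + 1)
}

for pos, prop in recall_by_position.items():
    print(f"position {pos:2d}: {prop:.2f}")
```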

Typical serial position effects are shown in the figure below. The y-axis plots the mean percent correct for each serial position. The x-axis plots the serial position.

 

 

Two things should be apparent from this graph. First, recall is never perfect. Second, words at the beginning and end of the list show better recall than words in the middle serial positions. The enhanced recall for the beginning of the list is referred to as a primacy effect and the enhanced recall for the end of the list is referred to as a recency effect.

 

It is important to distinguish between effects and theories. Effects are descriptive labels we give to empirical phenomena. Theories are explanations of why the effects obtain. If I asked you to give a theoretical explanation of serial position effects, you might start by telling me that recall is better for words at the beginning and end of an ordered list than for words in the middle. This is an accurate description of the data. Primacy and recency are not theoretical accounts of this description; they are merely labels that we use to summarize the empirical finding. The theoretical account is directed at answering the question, “Why is recall better for words at the beginning and end of an ordered list?” Or, “Why do primacy and recency effects obtain?”

 

Obviously, to account for primacy and recency effects we need to postulate a system that can plausibly produce them. As we will discuss later in this course, one such theory is the modal model. This theory postulates two different memory stores, a short-term memory (STM) and a long-term memory (LTM), along with a few assumptions about what these stores are responsible for. Not only does this theory provide an account of primacy and recency, but it also provides new predictions that can be tested to further substantiate the theory.
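To get a feel for how a two-store account can produce both primacy and recency, here is a toy simulation in the spirit of a rehearsal-buffer model. The buffer size, transfer probability, and recall assumptions are arbitrary illustrative choices, not the modal model’s actual parameters.

```python
# A toy two-store (STM/LTM) simulation. Items rehearsed in a small buffer have a
# chance of transferring to LTM; items still in the buffer at test are recalled.
# Early items spend more time in the buffer (primacy); the last few items are
# still in the buffer at test (recency). All parameter values are illustrative.

import random

random.seed(0)

LIST_LENGTH = 20
BUFFER_SIZE = 4          # capacity of the short-term rehearsal buffer
P_TRANSFER = 0.15        # chance an item is copied to LTM on each rehearsal cycle
N_TRIALS = 5000

recall_counts = [0] * LIST_LENGTH

for _ in range(N_TRIALS):
    buffer, in_ltm = [], set()
    for item in range(LIST_LENGTH):
        # New item enters the buffer, displacing a random earlier item if full.
        if len(buffer) == BUFFER_SIZE:
            buffer.pop(random.randrange(BUFFER_SIZE))
        buffer.append(item)
        # Every item currently being rehearsed gets a chance to transfer to LTM.
        for rehearsed in buffer:
            if random.random() < P_TRANSFER:
                in_ltm.add(rehearsed)
    # At test: recall items still in the buffer plus items that made it to LTM.
    for item in set(buffer) | in_ltm:
        recall_counts[item] += 1

for pos, count in enumerate(recall_counts, start=1):
    print(f"position {pos:2d}: {count / N_TRIALS:.2f}")
```

Even with these crude assumptions, the simulated recall curve is U-shaped: early positions benefit from extra rehearsal, late positions from still being in the buffer, and the middle positions suffer.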

 

It’s important to point out that theories are nothing more than explanations of our observations. Ideally, these theories are parsimonious – they are as simple as possible. Why should we prefer simple explanations to complex explanations? One reason is that we want to make as few assumptions as necessary to account for the data. A second reason is that simple explanations are easier to construct and easier to understand. That’s not to say that we shouldn’t construct complex theories, merely that we should start as simple as possible. As a rule, we start simple and add to our theory only when the data necessitate it. Parsimony is one property of a good theory.

 

The second property of a good theory is that it can be tested, and potentially disconfirmed. That is, if a theory makes such a wide range of predictions that it can account for any possible result, it can’t be tested. No matter how the experiment is designed, the outcome will be consistent with the theory, regardless of whether the theory is actually correct. When theories do not allow for the possibility of being incorrect (i.e., they predict any conceivable result), they are said to lack falsifiability.

 

Falsifiability relates to two philosophical views of theory testing: Confirmationism and falsificationism. Confirmationism states that theories gain credence when experiments confirm their predictions. Thus, the more experiments that are done that confirm a theory, the more confidence we can hold in the accuracy of the theory.

 

Confirmationism, however, only provides evidence consistent with the theory being evaluated and tells us nothing about the space of possible alternative theories. As an example, suppose there are three theories that each logically account for a single experimental result. Suppose we now run a new experiment, and the results again confirm all three theories. Should our confidence in any particular theory increase as a function of this experiment? No! However, suppose that we instead design an experiment that, depending on how it comes out, confirms one theory and disconfirms another. In this case, we are justified in increasing our confidence in the theory that is not disconfirmed.

 

The idea of designing experiments that can disconfirm a theory stems from falsificationism. The basic premise of falsificationism is that confidence in any particular theory should increase by virtue of eliminating alternatives and by setting up experiments that, if they come out one way, will disconfirm the theory.

 

If we cannot design an experiment that can distinguish between competing theories, the competitors would be said to lack identifiability. 

 

To return to the properties of good theories: good theories are coherent, both logically and empirically. That is, theories should make sense logically – they should not be able to make two contradictory predictions. Moreover, good theories should allow for the possibility that they will be disconfirmed – could an experiment be designed that could potentially produce results outside the boundaries of the theory’s predictions? This does not mean that such data will ever actually be observed; it means only that the theory’s range of predictions is not so wide that it can account for every plausible result.

 

Theories that satisfy logical coherence are good because they are falsifiable – they make predictions that, if disconfirmed, would render the theory incorrect or incomplete. Falsifiability is an important property of a theory. If a theory can account for several contradictory empirical findings simultaneously, then it may as well not make any predictions at all. That is, if a theory can account for all possible outcomes of all possible experiments, it lacks falsifiability. And if it lacks falsifiability, then it cannot be tested: what does it mean to confirm a theory if the outcome of the experiment is predicted by the theory regardless of how the experiment comes out?

 

A third property of good theories is that they both explain and predict. We want theories that can explain all of our past observations, but we also want to be able to use our theories to make new predictions that can be tested empirically. The ability to make new, testable predictions means that there is the potential for the theory to be falsified. If the new predictions cannot be supported empirically, we can conclude that the theory is somehow flawed, and is in need of modification (or even abandonment). If the new predictions are supported empirically, then the theory gains momentum. Theories that survive repeated attempts to be falsified may eventually be accepted as law. 

 

Think about how the modal model might be able to predict new experimental results.

 

Assumptions of Science

 

All scientific endeavors make at least four assumptions. One assumption is that the world and nature are lawful – that is, events are ordered and predictable. In the absence of lawfulness, events would be random, and experimentation impossible (or at least unfruitful).

 

The second assumption is that the world is a deterministic place. Events in the world and in nature have causes; things happen as the result of causal forces. In physics and chemistry, true determinism is possible: every time A occurs, it causes B. However, we need not be so rigid in how we think of determinism. We can think about determinism in a probabilistic sense: if A occurs, the likelihood of B increases. In fact, probabilistic determinism is the norm in most sciences, especially the behavioral and medical sciences. For example, we all accept that smoking causes cancer. However, the likelihood that one develops cancer as a result of smoking is less than 100%. That is, there is a probabilistic relationship between smoking and cancer. What we’re interested in is whether variable A increases or decreases the likelihood of variable B.
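A tiny simulation illustrates probabilistic determinism: exposure to A changes the likelihood of B without guaranteeing it. The probabilities below are invented purely for illustration.

```python
# Probabilistic determinism in miniature: A raises the probability of B without
# making B certain. Both probabilities here are hypothetical.

import random

random.seed(42)

P_B_GIVEN_A = 0.30       # hypothetical risk of B when A is present
P_B_GIVEN_NOT_A = 0.05   # hypothetical baseline risk of B
N = 100_000

def simulate(p_b, n):
    """Return the observed rate of B across n simulated cases."""
    return sum(random.random() < p_b for _ in range(n)) / n

print(f"rate of B with A:    {simulate(P_B_GIVEN_A, N):.3f}")
print(f"rate of B without A: {simulate(P_B_GIVEN_NOT_A, N):.3f}")
# A is a probabilistic cause: it changes the likelihood of B, not a certainty.
```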

 

The third assumption that we make is that empiricism is possible. Lawfulness and determinism do us little good if we cannot observe or measure the phenomena of interest.

 

The fourth assumption is that of parsimony. The assumption of parsimony simply states that we prefer simple explanations or theories to complex explanations. If a simpler explanation of the phenomenon is sufficient to account for the data, then it is preferred to the less simple explanation. Parsimony allows easy extrapolations of our findings.

 

Measurement in Science

Measurement is paramount to science. We cannot study phenomena that cannot be measured. One can think of two general classes of measurement: direct and indirect. Direct measurement involves being able to directly observe, through our senses, the phenomenon to be measured. For example, with a powerful enough telescope, an astronomer can directly observe the motion of the planets in our solar system. Chemists can measure the temperature and speed of chemical reactions and the change in molecular structure and mass. Astronomers often rely on pictures, and samples, taken by satellites or probes (such as the Mars Observer). And, for a time, during the behaviorist movement, American psychology relied almost exclusively on direct measurement to assess and describe behavior.

 

However, direct measurement cannot be used for all observations – when direct measurement isn’t possible, we look for indirect measurements. Indirect measurements are proxies for direct measurements – they tap the residue left over by the phenomenon. Sometimes we can’t directly observe the phenomenon of interest, but we can observe its by-products.

 

Case in point: the last several years have been a boon for astronomy, with the discovery of many new planets outside our solar system. These discoveries have been made not by directly imaging the planets (many of them are far too faint, and too close to their bright host stars, to be seen directly), but by identifying their “signatures”. Relying on rather simple laws of physics, it is possible to identify a new planet by looking for its trace. When a massive object circles a star, it exerts gravitational tugs and pulls on the star, and all this pulling and tugging produces a slight, regular wobble in the star’s motion. We don’t see the planet itself, but we do see the residual effects of the planet – its effect on the star. Thus, by measuring the extent of the gravitational pull and the regularity (or lawfulness) of the wobble, astronomers can calculate: (a) whether a planet is present, (b) its approximate mass, and (c) its orbit.

 

A lot of what we want to observe in psychology is not directly observable. We can’t peer into the minds of our participants to see what or how they are thinking. We can, however, measure the residual effects of their mental processes. We can’t see what happens to a memory trace as a result of having you use an elaborative rehearsal strategy, but we can measure its residual effects – what is left over as a result of the elaborative rehearsal. We will discuss at length paradigms in which we can learn about the properties and functioning of the mind by tapping into the by-products of mental functions.

 

A good example of the use of indirect measurement in cognitive psychology is a paradigm developed by George Sperling in the 1960s. Sperling was interested in the capacity of what we call iconic memory – how much information the visual system can hold. In cognitive psychology, we typically think of at least three types of memory: very short-term sensory memory, short-term/working memory, and long-term memory. Sensory memory is assumed to last no more than 1-2 seconds; it’s that very brief interval in which you first perceive information before you become conscious of it. In the case of iconic memory, one might think of it as visual persistence – the visual image that is briefly retained after sensing an object. Short-term memory is typically thought of as lasting no longer than 30 seconds, and long-term memory is your relatively permanent record of past events. One can think of short-term memory as the information that you are currently consciously aware of.

 

In Sperling’s case, he was interested in the capacity of iconic memory – how much information the visual system could encode. However, he faced a special problem in investigating this phenomenon: how do we measure how much information can be held in sensory memory? In his initial attempts, Sperling developed what is now called the full-report procedure. Participants are flashed a 3 × 4 array of letters and then required to report the letters in the array verbally, after a delay of 0-5 seconds. A typical trial of this procedure is given below:

(a) Fixation point (“0”), presented for 500 msec.
(b) Letter array, presented for 50 msec:
      C F P Y
      J M B X
      S G R L
(c) Blank screen, 0-5 sec delay.
(d) Report the letters.

The results from this procedure give us two types of data. First, we can examine how much information is held in iconic memory from the number of items reported. Second, we can examine how long that information is retained in iconic memory. For example, if participants can no longer report the contents of the array after a few seconds, this would suggest that the contents of iconic memory had faded. Results using the full-report procedure revealed that participants could accurately report around 37% of the letters in the array, suggesting that the capacity of iconic memory was limited to about 4.5 items. In addition, accuracy decreased markedly after only a few seconds, suggesting that information remains active in iconic memory for only a very brief period of time.

 

·        Can you think of why the full-report procedure might give us an inaccurate estimate of the capacity of sensory memory? To answer this question, it might help to consider the methodology in the context of our assumption about the duration of sensory memory.

 

Note that the full-report procedure has a built-in confound. If the duration of iconic memory is really only a few seconds, and participants have to report the contents of their memory verbally, and reporting the letters takes time, then it is possible that participants’ iconic memory is holding more than they can report. For example, suppose that it takes 500 msec to report each letter. By the time participants report 4 letters, 2 seconds have elapsed, and anything else that may have been encoded in iconic memory will have decayed. Perhaps iconic memory has a larger capacity than can be measured using the full-report procedure.

 

To address this potential concern, Sperling developed a second method, called the partial-report procedure. The same basic paradigm was used, except that just before being asked to report, participants were told which row to report. The instruction was simply an auditory tone: a high-pitched tone signaled the top row, a medium-pitched tone the middle row, and a low-pitched tone the bottom row. Note that the cue to retrieve a particular row cannot affect what participants try to study in the array, since they do not know which row they will have to report until just prior to the report.

Partial report procedure: High, medium, low tones cued different rows in the array.

 

 

 

(a) Fixation point (“0”), presented for 500 msec.
(b) Letter array, presented for 50 msec:
      C F P Y
      J M B X
      S G R L
(c) Blank screen, 0-5 sec delay.
(d) Tone cue signaling which row to report.
(e) Report the letters from the cued row.

Using this new procedure, Sperling found that participants could report 3-4 letters (about 76%) from whichever row was cued. By extension, iconic memory must be able to store around 9-10 letters, or roughly 75% of the 12-letter display. The ultimate conclusion of this line of research was based on the results of the partial-report procedure.
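The arithmetic behind the two capacity estimates can be written out explicitly (using the percentages reported above; the scaling step assumes the cued row is representative of every row, which is the logic of the partial-report procedure):

```python
# Capacity estimates from the full- and partial-report procedures.

array_size = 12          # 3 rows x 4 letters
row_size = 4

full_report_accuracy = 0.37            # proportion of the whole array reported
full_report_estimate = full_report_accuracy * array_size       # ~4.5 letters

partial_report_accuracy = 0.76         # proportion of the single cued row reported
# The cued row is chosen only after the display disappears, so performance on
# that row is taken as representative of every row: scale it up to the array.
partial_report_estimate = partial_report_accuracy * array_size  # ~9 letters

print(f"full report estimate:    {full_report_estimate:.1f} letters")
print(f"partial report estimate: {partial_report_estimate:.1f} letters")
```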

 

What do the full- and partial-report procedures tell us about measurement in cognitive psychology? First, we are inferring capacity using an indirect measure – the number of items that can be retrieved. Second, we are inferring duration using an indirect measure – how long it takes for those items to decay from iconic memory. Finally, we see that the methodology used in the measurement process affects our conclusions. If Sperling had not developed the partial-report procedure, the conclusions would have been dramatically different. Note that the methodology we use to measure the underlying construct (the capacity of iconic memory) is tied to our theory. Our measurements are no better than our methods – if our methods are tainted or flawed, our measurements are tainted or flawed, and as a result, our scientific conclusions may be inaccurate.

 

Observation through simulation.

 

Sometimes the systems we want to observe are not easily observed as a whole, so we develop simulation models. Computer simulation models allow us to predict events or patterns of events by instantiating certain assumptions within the context of a computer program. Simulations of ocean currents use assumptions based on topographical data and sea temperatures. Simulations of tidal waves are built from underlying topographical data, fault lines in the seafloor, and assumptions about the force of certain types of underwater landslides and the effect of those landslides on water movement (again using empirical data to derive the assumptions).

 

One of my favorite simulation models was constructed to describe the behavior of bubbles in beer:

 

Example: Why do some bubbles in Guinness Stout sink rather than rise? Simulations of Guinness Stout, carried out by a group of scientists at the University of New South Wales in Sydney, Australia, reveal that it has to do with the size of the bubbles. Bubbles smaller than about 50-60 micrometers have too little buoyancy and momentum to resist the downward currents. Apparently it also has something to do with the shape of the glass into which the beer is poured: the barrel-chested Guinness glass is the best for demonstrating the effect. The simulation model is based on fluid dynamics; no actual observations were made, just a simulation.

 

Psychology uses simulation and mathematical models to help inform us about behavior, and this is particularly the case in cognitive psychology. Undoubtedly, human thought processes are among the most complex entities to which the scientific method has been applied. This was noted even by Albert Einstein, who remarked that the mind would be the last universe to be completely understood. Given this complexity, it is often difficult to anticipate the dynamics of the theories we develop to account for thought processes. Indeed, it is often the case that even the originator of a theory does not completely understand the full force of their theoretical statements. Thus, one method we can use to help us explore our theories is to build simulation models that instantiate the theories in mathematical form, which can then be used to simulate or derive predictions about human behavior. Simulation models are, therefore, specific instantiations of a theory. In order to make a simulation work, we must specify the theory at a level of detail that enables us to write mathematical statements.

 

Simulation models serve several purposes, but the most important among them is that they enable us to explore the expected relationships among variables as if our theory were correct – they give us a way to observe behavior without really observing behavior. Thus, simulation models can be an enormous help in guiding experimentation. In this sense, simulation models serve a heuristic function, as our choice of which experiments to develop can be informed by our theory.

 

The realism of models. Models by their very nature are simplifications of the real world. Just as a model airplane is a simplification of a real airplane, scientific models are simplified representations of the world. Thus, there is nothing real about a model at all, except that it is intended to capture the essence of some aspect of the real world. This is an important point to keep in mind as you encounter models of cognition in this class. Regardless of their apparent sophistication, all models are false in the sense that they are simplified versions of the world they are intended to represent! Mathematically specified models are no more realistic than non-mathematical models. Moreover, the fact that a model provides good fits to the data is not, in itself, a good reason to believe the model is correct. Indeed, a model can be fundamentally incorrect, yet fit data well.
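One way to see this is to fit a model that has no theoretical content at all. In this sketch (all numbers are invented), data generated from a simple exponential forgetting curve are fit nearly perfectly by a cubic polynomial – a good fit, but not a good theory.

```python
# "Good fit is not proof of correctness": simulated forgetting data generated by
# an exponential curve are fit very closely by an atheoretical cubic polynomial.

import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(1, 30, 30)                    # retention interval (arbitrary units)
true_recall = 0.9 * np.exp(-0.1 * t)          # the "true" generating process
observed = true_recall + rng.normal(0, 0.02, t.size)

coeffs = np.polyfit(t, observed, deg=3)       # a flexible but theoretically empty model
predicted = np.polyval(coeffs, t)

ss_res = np.sum((observed - predicted) ** 2)
ss_tot = np.sum((observed - observed.mean()) ** 2)
print(f"R^2 of the atheoretical cubic model: {1 - ss_res / ss_tot:.3f}")
```

The cubic fits the data extremely well, yet no one would take a cubic polynomial seriously as a theory of forgetting; goodness of fit alone cannot settle which explanation is correct.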

 

Summary.

 

You fill in the blanks here.